Abstract
We compare two link-analysis methods for ranking the web pages within a site. The first, called Site Rank, is an adaptation of PageRank to the granularity of a web site. The second, called Popularity Rank, is based on the frequencies of user clicks on the outlinks of a page, as captured by the navigation sessions of users through the web site. We ran experiments on artificially created web sites of different sizes and on two real data sets, using the relative entropy to compare the distributions produced by the two ranking methods. For the real data sets we also use a nonparametric measure, Spearman's footrule, to compare the top-ten web pages ranked by the two methods. Our main result is that the distributions of the Popularity Rank and the Site Rank are surprisingly close to each other, implying that the topology of a web site is instrumental in guiding users through the site. Thus, in practice, the Site Rank provides a reasonable first-order approximation of the aggregate behaviour of users within a web site, as given by the Popularity Rank.
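The two comparison measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the toy page names and the example distributions are hypothetical. The direction of the relative entropy (Popularity Rank measured against Site Rank) and the convention of placing a page that is absent from one top-k list at rank k+1 are assumptions, since the abstract does not state them.

```python
import math


def relative_entropy(p, q):
    """Relative entropy (Kullback-Leibler divergence) D(p || q) in bits.

    Assumes p and q are probability distributions over the same pages and
    that q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def footrule_top_k(a, b):
    """Spearman's footrule between two top-k lists of equal length k.

    Assumption: a page missing from one list is placed at rank k + 1.
    """
    k = len(a)
    rank_a = {page: i + 1 for i, page in enumerate(a)}
    rank_b = {page: i + 1 for i, page in enumerate(b)}
    pages = set(a) | set(b)
    return sum(abs(rank_a.get(x, k + 1) - rank_b.get(x, k + 1)) for x in pages)


# Toy distributions over three pages (made-up numbers, not from the paper).
site_rank = [0.5, 0.3, 0.2]
popularity_rank = [0.4, 0.4, 0.2]
divergence = relative_entropy(popularity_rank, site_rank)

# Toy top-3 lists (hypothetical page names).
distance = footrule_top_k(["home", "about", "news"], ["home", "news", "contact"])

print(f"relative entropy: {divergence:.4f} bits")
print(f"footrule distance: {distance}")
```

A relative entropy near zero means the two rank distributions are close, which is the sense in which the abstract calls Popularity Rank and Site Rank close to each other. A footrule distance of zero means the two top-k lists agree in both membership and order.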
Cite this article
Borges, J., Levene, M. Ranking Pages by Topology and Popularity within Web Sites. World Wide Web 9, 301–316 (2006). https://doi.org/10.1007/s11280-006-8558-y