Where to Start Browsing the Web?

Fogaras, Dániel

doi:10.1007/978-3-540-39884-4_6

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2877)

Included in the following conference series:

International Workshop on Innovative Internet Community Systems

Abstract

Both human users and crawlers face the problem of finding good start pages for exploring a topic. We show how link-based ranking algorithms can help qualify pages as start nodes. We introduce a class of hub ranking methods based on counting the short search paths of the Web. Somewhat surprisingly, the PageRank scores computed on the reversed Web graph turn out to be a special case of our class of rank functions. Besides query-based examples, we propose graph-based techniques to evaluate the performance of the introduced ranking algorithms. Centrality analysis experiments show that a small portion of Web pages induced by the top-ranked pages dominates the Web, in the sense that other pages can be accessed from them within a few clicks on average; furthermore, the removal of such nodes rapidly destroys the connectivity of the Web graph. By calculating the domination and connectivity decay, we compare and analyze the proposed ranking algorithms solely from the structure of the Web, without the need for human interaction. Apart from ranking algorithms, the existence of central pages is interesting in its own right, providing deeper insight into the Small World property of the Web graph.
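The two ideas named in the abstract, PageRank on the reversed Web graph and the average number of clicks needed from a start page, can be illustrated with a minimal sketch. This is not the chapter's implementation, and it does not reproduce the chapter's general class of path-counting rank functions. The function names `reverse_pagerank` and `average_clicks`, the damping factor 0.85, the iteration count, and the toy graph are all assumptions made for illustration.

```python
from collections import deque


def reverse_pagerank(graph, damping=0.85, iterations=100):
    """Power-iteration PageRank on the graph with every link reversed.

    graph maps each page to the list of pages it links to.
    The damping factor and iteration count are illustrative assumptions.
    """
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)

    # Reverse every link: an original link u -> v becomes v -> u.
    reversed_links = {u: [] for u in nodes}
    for u, targets in graph.items():
        for v in targets:
            reversed_links[v].append(u)

    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Rank held by pages with no out-links in the reversed graph
        # is redistributed uniformly over all pages.
        dangling = sum(rank[u] for u in nodes if not reversed_links[u])
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            out = reversed_links[u]
            for v in out:
                new_rank[v] += damping * rank[u] / len(out)
        rank = new_rank
    return rank


def average_clicks(graph, start_pages):
    """Mean BFS distance from the start set to the other pages reachable from it."""
    dist = {s: 0 for s in start_pages}
    queue = deque(start_pages)
    while queue:
        u = queue.popleft()
        for v in graph.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    others = [d for d in dist.values() if d > 0]
    return sum(others) / len(others) if others else 0.0


# Hypothetical toy Web: one portal page linking to three section pages.
toy_web = {
    "portal": ["news", "sports", "tech"],
    "news": ["article1", "article2"],
    "sports": ["article2"],
    "tech": ["article3"],
    "article1": [],
    "article2": [],
    "article3": ["portal"],
}

scores = reverse_pagerank(toy_web)
best_start = max(scores, key=scores.get)
clicks_from_best = average_clicks(toy_web, [best_start])
```

On this toy graph the highest reverse-PageRank page is the portal, and the BFS helper gives a rough analogue of the "few clicks on average" measure. The connectivity decay experiment described in the abstract would additionally require removing the top-ranked pages and re-measuring reachability, which this sketch does not attempt.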

Research is supported by grants OTKA T 42559 and T 042706 of the Hungarian National Science Fund.

Author information

Authors and Affiliations

Department of Computer Science and Information Theory, Budapest University of Technology and Economics, H–1521, Budapest, Hungary
Dániel Fogaras
Computer and Automation Research Institute, Hungarian Academy of Sciences (MTA SZTAKI), 11 Lágymányosi u., H–1111, Budapest, Hungary
Dániel Fogaras

Editor information

Editors and Affiliations

Institut für Mathematik, Technische Universität Ilmenau, PF 100565, 98684, Ilmenau, Germany
Thomas Böhme
Institute of Computer Science, University of Leipzig,
Gerhard Heyer
Computer Science Dept., University of Rostock, 18051, Rostock, Germany
Herwig Unger

About this paper

Cite this paper

Fogaras, D. (2003). Where to Start Browsing the Web? In: Böhme, T., Heyer, G., Unger, H. (eds) Innovative Internet Community Systems. IICS 2003. Lecture Notes in Computer Science, vol 2877. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39884-4_6

DOI: https://doi.org/10.1007/978-3-540-39884-4_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20436-7
Online ISBN: 978-3-540-39884-4
eBook Packages: Springer Book Archive
