
Where to Start Browsing the Web?

  • Conference paper
Innovative Internet Community Systems (IICS 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2877)

Abstract

Both human users and crawlers face the problem of finding good start pages from which to explore a topic. We show how link-based ranking algorithms can help qualify pages as start nodes. We introduce a class of hub ranking methods based on counting the short search paths of the Web. Somewhat surprisingly, the PageRank scores computed on the reversed Web graph turn out to be a special case of our class of rank functions. Besides query-based examples, we propose graph-based techniques to evaluate the performance of the introduced ranking algorithms. Centrality analysis experiments show that a small portion of Web pages induced by the top-ranked pages dominates the Web, in the sense that other pages can be accessed from them within a few clicks on average; furthermore, the removal of such nodes rapidly destroys the connectivity of the Web graph. By calculating the domination and connectivity decay, we compare and analyze the proposed ranking algorithms solely from the structure of the Web, without the need for human interaction. Apart from ranking algorithms, the existence of central pages is interesting in its own right, providing deeper insight into the Small World property of the Web graph.
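The abstract names two ideas that a small example may make more concrete: PageRank computed on the reversed Web graph, and the average number of clicks needed to reach other pages from a set of start pages. The following is a minimal sketch under stated assumptions, not the paper's implementation. The function names `reverse_pagerank` and `average_clicks`, the damping factor 0.85, the uniform handling of dangling nodes, and the toy graph are all illustrative choices that do not come from the paper.

```python
from collections import deque


def reverse_pagerank(out_links, damping=0.85, iterations=50):
    """Power-iteration PageRank on the graph with every edge reversed.

    `out_links` maps each page to the list of pages it links to.
    The damping value and iteration count are assumptions for illustration.
    """
    nodes = sorted(set(out_links) | {v for vs in out_links.values() for v in vs})
    # Reversing the edge u -> v gives v -> u, so the reversed out-links
    # of a page are exactly its original in-links.
    reversed_out = {node: [] for node in nodes}
    for u, targets in out_links.items():
        for v in targets:
            reversed_out[v].append(u)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = reversed_out[node]
            if targets:
                share = damping * rank[node] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page in the reversed graph: spread its mass uniformly.
                share = damping * rank[node] / n
                for t in nodes:
                    new_rank[t] += share
        rank = new_rank
    return rank


def average_clicks(out_links, start_pages):
    """Mean breadth-first distance from the start set to every reachable page."""
    dist = {page: 0 for page in start_pages}
    queue = deque(start_pages)
    while queue:
        u = queue.popleft()
        for v in out_links.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return sum(dist.values()) / len(dist)


# Hypothetical five-page Web: "portal" links out widely, so it acts as a hub.
toy_web = {
    "portal": ["a", "b", "c"],
    "a": ["d"],
    "b": ["d"],
    "c": [],
    "d": [],
}
scores = reverse_pagerank(toy_web)
best_start = max(scores, key=scores.get)
clicks = average_clicks(toy_web, [best_start])
```

On this toy graph the page with many out-links receives the highest reversed-graph score, which matches the intuition that a good start page is one from which many others are reachable in a few clicks. The `average_clicks` helper only averages over pages that are actually reachable from the start set; a measure for a real crawl would also need a convention for unreachable pages.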

Research is supported by grants OTKA T 42559 and T 042706 of the Hungarian National Science Fund.




Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fogaras, D. (2003). Where to Start Browsing the Web?. In: Böhme, T., Heyer, G., Unger, H. (eds) Innovative Internet Community Systems. IICS 2003. Lecture Notes in Computer Science, vol 2877. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39884-4_6


  • DOI: https://doi.org/10.1007/978-3-540-39884-4_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-20436-7

  • Online ISBN: 978-3-540-39884-4

  • eBook Packages: Springer Book Archive
