
Diversity-Based HITS: Web Page Ranking by Referrer and Referral Diversity

  • Yoshiyuki Shoji
  • Katsumi Tanaka
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8238)

Abstract

We propose a Web page ranking method that considers the diversity of both the pages a page links to and the pages that link to it. Typical link analysis algorithms such as HITS and PageRank compute scores from the number of linking pages. However, even when the number of links is the same, a document linked from pages with similar content differs greatly from one linked from pages with very different content. We define two types of link diversity, referral diversity (the diversity of the pages a page links to) and referrer diversity (the diversity of the pages linking to a page), and use the resulting diversity scores to extend the basic HITS algorithm. Repeated experiments showed that the diversity-based method is more effective than the original HITS algorithm at finding useful information on the Web.
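The abstract does not give the scoring formulas, so the following Python sketch only illustrates the general idea under explicit assumptions. It assumes that diversity is the mean pairwise cosine distance between term vectors, and that each HITS update is multiplied by one plus the relevant diversity score. The names `diversity`, `diversity_hits`, `links`, and `vectors` are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two sparse term vectors (dicts)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def diversity(pages, vectors):
    """ASSUMED measure: mean pairwise cosine distance (0.0 if < 2 pages)."""
    pairs = list(combinations(pages, 2))
    if not pairs:
        return 0.0
    total = sum(1.0 - cosine(vectors[a], vectors[b]) for a, b in pairs)
    return total / len(pairs)


def normalize(scores):
    """Scale scores to unit Euclidean length, as in standard HITS."""
    norm = math.sqrt(sum(x * x for x in scores.values())) or 1.0
    return {p: x / norm for p, x in scores.items()}


def diversity_hits(links, vectors, iterations=20):
    """HITS with ASSUMED multiplicative diversity weights.

    links:   dict mapping a page to the pages it links to
    vectors: dict mapping a page to its sparse term vector
    """
    nodes = sorted(set(links) | {q for qs in links.values() for q in qs})
    out_links = {p: sorted(set(links.get(p, ()))) for p in nodes}
    in_links = {p: [q for q in nodes if p in out_links[q]] for p in nodes}

    # Referral diversity: how varied are the pages this page links to?
    referral = {p: 1.0 + diversity(out_links[p], vectors) for p in nodes}
    # Referrer diversity: how varied are the pages linking to this page?
    referrer = {p: 1.0 + diversity(in_links[p], vectors) for p in nodes}

    hub = {p: 1.0 for p in nodes}
    auth = {p: 1.0 for p in nodes}
    for _ in range(iterations):
        auth = normalize(
            {p: referrer[p] * sum(hub[q] for q in in_links[p]) for p in nodes}
        )
        hub = normalize(
            {p: referral[p] * sum(auth[q] for q in out_links[p]) for p in nodes}
        )
    return auth, hub


# X and Y each receive two links; only X's referrers differ in content.
links = {"A": ["X"], "B": ["X"], "C": ["Y"], "D": ["Y"]}
vectors = {
    "A": {"cooking": 1.0}, "B": {"physics": 1.0},
    "C": {"cooking": 1.0}, "D": {"cooking": 1.0},
    "X": {"news": 1.0}, "Y": {"news": 1.0},
}
auth, hub = diversity_hits(links, vectors)

# With identical content everywhere, the weights are all 1.0 (plain HITS).
uniform = {p: {"cooking": 1.0} for p in vectors}
plain_auth, _ = diversity_hits(links, uniform)
```

In this toy graph, plain link counting cannot separate X from Y, whereas the assumed referrer-diversity weight favors X because its two referrers cover different topics. The multiplicative form is only one of several plausible ways to combine diversity with HITS scores.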

Keywords

Novice User, Social Bookmark, Search Result Page, Popularity Score, Link Analysis Algorithm
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing Switzerland 2013

Authors and Affiliations

  • Yoshiyuki Shoji (1)
  • Katsumi Tanaka (1)
  1. Department of Social Informatics, Graduate School of Informatics, Kyoto University, Kyoto, Japan
