Hierarchical Link Analysis for Ranking Web Data

  • Renaud Delbru
  • Nickolai Toupikov
  • Michele Catasta
  • Giovanni Tummarello
  • Stefan Decker
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6089)

Abstract

On the Web of Data, entities are often interconnected in a way similar to web documents. Previous work has shown how PageRank can be adapted to rank entities. In this paper, we propose to exploit locality on the Web of Data by taking a layered approach, similar to hierarchical PageRank approaches. We provide justifications for a two-layer model of the Web of Data and introduce DING (Dataset Ranking), a novel ranking methodology based on this two-layer model. DING uses links between datasets to compute dataset ranks and combines the resulting values with semantic-dependent entity ranking strategies. We compare the effectiveness of the approach with that of other link-based algorithms on large datasets from the Sindice search engine. The evaluation, which includes a user study, indicates that the resulting ranking is better than those of the other approaches. The resulting algorithm is also shown to have desirable computational properties, such as being parallelisable.
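
To make the two-layer idea concrete, the following is a minimal sketch and not the formulation used in the paper. It assumes that a dataset-level graph can be obtained by aggregating entity links that cross dataset boundaries, that a plain PageRank run inside each dataset can stand in for the semantic-dependent entity ranking strategies, and that the two layers are combined by a simple product. The names `pagerank`, `two_layer_rank`, `entity_dataset` and `entity_links` are hypothetical.

```python
from collections import defaultdict


def pagerank(nodes, edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over weighted edges {(src, dst): weight}."""
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    out_weight = defaultdict(float)
    for (src, _dst), w in edges.items():
        out_weight[src] += w
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if out_weight[v] == 0.0)
        nxt = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for (src, dst), w in edges.items():
            nxt[dst] += damping * rank[src] * w / out_weight[src]
        rank = nxt
    return rank


def two_layer_rank(entity_dataset, entity_links):
    """Rank datasets first, then entities within each dataset.

    entity_dataset: {entity: dataset}; entity_links: [(src, dst), ...].
    ASSUMPTION: the product combination and the local PageRank are
    illustrative stand-ins, not the paper's entity ranking strategies.
    """
    datasets = sorted(set(entity_dataset.values()))
    inter = defaultdict(float)                        # dataset -> dataset
    intra = defaultdict(lambda: defaultdict(float))   # links inside a dataset
    for src, dst in entity_links:
        ds, dd = entity_dataset[src], entity_dataset[dst]
        if ds == dd:
            intra[ds][(src, dst)] += 1.0
        else:
            inter[(ds, dd)] += 1.0  # weight = number of cross-dataset links

    dataset_rank = pagerank(datasets, inter)

    scores = {}
    for ds in datasets:
        members = [e for e, d in entity_dataset.items() if d == ds]
        # Each dataset is ranked independently of the others, so this
        # loop could be run in parallel, one task per dataset.
        local = pagerank(members, intra[ds])
        for e in members:
            scores[e] = dataset_rank[ds] * local[e]
    return dataset_rank, scores


if __name__ == "__main__":
    entity_dataset = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}
    entity_links = [("b1", "a1"), ("b2", "a1"), ("c1", "a1"),
                    ("a2", "a1"), ("b1", "b2")]
    dataset_rank, scores = two_layer_rank(entity_dataset, entity_links)
    for entity in sorted(scores, key=scores.get, reverse=True):
        print(entity, round(scores[entity], 4))
```

Because the local computation only touches links inside one dataset, the per-dataset step is independent across datasets, which is one plausible reading of why a layered scheme lends itself to parallel execution.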

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Renaud Delbru (1)
  • Nickolai Toupikov (1)
  • Michele Catasta (2)
  • Giovanni Tummarello (1, 3)
  • Stefan Decker (1)

  1. Digital Enterprise Research Institute, National University of Ireland, Galway, Galway, Ireland
  2. School of Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
  3. Fondazione Bruno Kessler, Trento, Italy
