What’s in a Link? From Document Importance to Topical Relevance

  • Marijn Koolen
  • Jaap Kamps
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5766)


Web information retrieval is best known for its use of the Web’s link structure as a source of evidence. Global link evidence is by nature query-independent, and is therefore no direct indicator of the topical relevance of a document for a given search request. As a result, link information is usually considered to be useful to identify the ‘importance’ of documents. Local link evidence, in contrast, is query-dependent and could in principle be related to the topical relevance. We analyse the link evidence in Wikipedia using a large set of ad hoc retrieval topics and relevance judgements to investigate the relation between link evidence and topical relevance.
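To make the distinction between global and local link evidence concrete, the following is a minimal sketch using in-degree as the link measure. It reflects one common reading of the distinction and is not necessarily the exact definition used in the paper. The toy collection, the document names, and the functions `global_indegree` and `local_indegree` are all hypothetical and introduced only for illustration.

```python
from collections import Counter


def global_indegree(links):
    """Count incoming links per document over the whole collection.

    This score is query-independent: it is the same for every search request.
    """
    counts = Counter()
    for source, targets in links.items():
        for target in targets:
            if target != source:  # ignore self-links
                counts[target] += 1
    return counts


def local_indegree(links, results):
    """Count incoming links only among the documents retrieved for one query.

    This score is query-dependent: it changes with the result set.
    """
    retrieved = set(results)
    counts = Counter()
    for source in retrieved:
        for target in links.get(source, ()):
            if target in retrieved and target != source:
                counts[target] += 1
    return counts


# Hypothetical toy collection: document -> set of documents it links to.
links = {
    "Amsterdam": {"Netherlands", "Canal"},
    "Rotterdam": {"Netherlands"},
    "Canal": {"Amsterdam", "Netherlands"},
    "Tulip": {"Netherlands"},
}

# Hypothetical result set for a single query about canals.
results = ["Amsterdam", "Canal"]

print(global_indegree(links))
print(local_indegree(links, results))
```

In this toy example, "Netherlands" has the highest global in-degree but receives no local link evidence, because it is not among the documents retrieved for the query.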





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Marijn Koolen (1)
  • Jaap Kamps (1, 2)

  1. Archives and Information Studies, University of Amsterdam, The Netherlands
  2. ISLA, University of Amsterdam, The Netherlands
