An Analysis on Topic Features and Difficulties Based on Web Navigational Retrieval Experiments

  • Masao Takaku
  • Keizo Oyama
  • Akiko Aizawa
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4182)


We analyze the relationship between topic features and topic difficulty in Web navigational retrieval tasks, based on experiments with the NTCIR-5 Web test collection. Our analysis shows that the difficulty of a retrieval task is closely related to the specificity of the topic, and that topics in certain categories are more difficult than others. For example, the representative page of a company or an organization is, on average, easier to find than that of a person, a product, or an event. These results suggest that adding metadata to a topic could help search engines predict task difficulty. Additionally, we show that the number of unique documents retrieved by different systems correlates weakly with query performance.
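The last finding, that pool diversity across systems relates to query performance, can be sketched as a rank correlation between the number of unique documents retrieved for each topic and a per-topic effectiveness score. The sketch below is illustrative only: the topic IDs, run results, and scores are invented, not taken from the NTCIR-5 collection, and the paper does not specify this exact computation.

```python
# Illustrative sketch: correlating per-topic pool diversity with performance.
# All data here is hypothetical, not from the NTCIR-5 test collection.
from itertools import chain

def unique_doc_count(runs_for_topic):
    """Number of distinct documents retrieved by any system for one topic."""
    return len(set(chain.from_iterable(runs_for_topic)))

def spearman(xs, ys):
    """Spearman rank correlation, with average ranks for ties."""
    def ranks(vs):
        order = sorted(range(len(vs)), key=lambda i: vs[i])
        r = [0.0] * len(vs)
        i = 0
        while i < len(order):
            j = i
            while j + 1 < len(order) and vs[order[j + 1]] == vs[order[i]]:
                j += 1
            avg = (i + j) / 2.0 + 1.0       # average rank for a tie group
            for k in range(i, j + 1):
                r[order[k]] = avg
            i = j + 1
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)

# Hypothetical data: three systems' retrieved documents per topic, plus a
# per-topic score (e.g. reciprocal rank of the target page).
runs = {
    "T01": [["d1", "d2"], ["d1", "d3"], ["d2", "d4"]],  # 4 unique docs
    "T02": [["d5", "d6"], ["d5", "d6"], ["d5", "d7"]],  # 3 unique docs
    "T03": [["d8"], ["d9"], ["d10"]],                   # 3 unique docs
}
scores = {"T01": 0.25, "T02": 1.0, "T03": 0.5}

topics = sorted(runs)
uniques = [unique_doc_count(runs[t]) for t in topics]
perf = [scores[t] for t in topics]
rho = spearman(uniques, perf)
```

A weak correlation, as reported in the abstract, would show up here as a rho of small magnitude over the full topic set; with real data one would typically use `scipy.stats.spearmanr` instead of the hand-rolled function.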





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Masao Takaku (1)
  • Keizo Oyama (2)
  • Akiko Aizawa (2)
  1. Research Organization of Information and Systems
  2. National Institute of Informatics, Tokyo, Japan
