Information Retrieval, Volume 13, Issue 5, pp 460–484

Expected reading effort in focused retrieval evaluation

  • Paavo Arvola
  • Jaana Kekäläinen
  • Marko Junkkari

Focused Retrieval and Result Aggregation


Abstract

This study introduces a novel framework for evaluating passage and XML retrieval. The framework focuses on the effort a user must expend to localize relevant content within a result document. Effort is measured with respect to a system-guided reading order of the document and is calculated as the quantity of text the user is expected to browse through. More specifically, this study develops evaluation metrics for retrieval methods following a fetch-and-browse approach: in the fetch phase, documents are ranked in decreasing order of their document scores, as in document retrieval; in the browse phase, a set of non-overlapping passages representing the relevant text within each retrieved document is returned. In other words, the passages of the document are reorganized so that the best-matching passages are read first, in sequential order. We introduce an application scenario motivating the framework and propose sample metrics based on it. These metrics provide a basis for comparing the effectiveness of traditional document retrieval with passage/XML retrieval, and illuminate the benefit of the latter.


Keywords: Passage retrieval · XML retrieval · Evaluation · Metrics · Small screen devices



The study was supported by the Academy of Finland under grants #115480 and #130482.



Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Paavo Arvola (1, email author)
  • Jaana Kekäläinen (1)
  • Marko Junkkari (2)
  1. Department of Information Studies and Interactive Media, University of Tampere, Tampere, Finland
  2. Department of Computer Sciences, University of Tampere, Tampere, Finland
