Abstract
All search happens in a particular context—such as the particular collection of a digital library, its associated search tasks, and its associated users. Information retrieval researchers usually agree on the importance of context, but they rarely address the issue. In particular, evaluation in the Cranfield tradition requires abstracting away from individual differences between users. This paper investigates whether we can bring some of this context into the Cranfield paradigm. Our approach is as follows: we attempt to record the “context” of the humans already in the loop—the topic authors/assessors—by designing targeted questionnaires. The questionnaire data becomes part of the evaluation test suite as valuable data on the context of the search requests. We have experimented with this questionnaire approach during the evaluation campaign of the INitiative for the Evaluation of XML Retrieval (INEX). The results of this case study demonstrate the viability of the questionnaire approach as a means to capture context in evaluation. This can help explain and control some of the user or topic variation in the test collection. Moreover, it allows us to break down the set of topics into various meaningful categories, e.g. those that suit a particular task scenario, and zoom in on the relative performance for such a group of topics.
This research was partly funded by DELOS (an EU network of excellence in Digital Libraries) through the INEX initiative for the evaluation of XML retrieval.
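The topic breakdown described in the abstract—grouping topics by a questionnaire-derived category and comparing performance within each group—can be sketched as follows. This is a minimal illustrative sketch: the topic identifiers, category labels, and scores are invented for the example and are not taken from the INEX test suite.

```python
# Hypothetical sketch: break per-topic evaluation scores down by a
# questionnaire-derived category (e.g. task scenario) and average
# within each group. All names and numbers here are illustrative.
from collections import defaultdict
from statistics import mean

# Per-topic effectiveness scores (e.g. average precision) for one system.
scores = {"t1": 0.42, "t2": 0.58, "t3": 0.31, "t4": 0.77}

# Questionnaire answers assigning each topic to a task-scenario category.
category = {"t1": "known-item", "t2": "exploratory",
            "t3": "exploratory", "t4": "known-item"}

def mean_score_by_category(scores, category):
    """Group per-topic scores by their category and average each group."""
    groups = defaultdict(list)
    for topic, score in scores.items():
        groups[category[topic]].append(score)
    return {cat: mean(vals) for cat, vals in groups.items()}

print(mean_score_by_category(scores, category))
# → {'known-item': 0.595, 'exploratory': 0.445}
```

Zooming in on one category is then a matter of reporting the per-group means (or running a significance test within a group) instead of a single mean over all topics.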
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
Cite this paper
Kamps, J., Lalmas, M., Larsen, B. (2009). Evaluation in Context. In: Agosti, M., Borbinha, J., Kapidakis, S., Papatheodorou, C., Tsakonas, G. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2009. Lecture Notes in Computer Science, vol 5714. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04346-8_33
DOI: https://doi.org/10.1007/978-3-642-04346-8_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04345-1
Online ISBN: 978-3-642-04346-8