Evaluating the effectiveness of relevance feedback based on a user simulation model: effects of a user scenario on cumulated gain value
We propose a method for evaluating relevance feedback by simulating real users. The simulation applies a model that defines the user's relevance threshold for accepting individual documents as feedback in a graded relevance environment, the user's patience in browsing the initial list of retrieved documents, and the effort invested in providing the feedback. We evaluate the results using cumulated gain-based evaluation combined with freezing all documents seen by the user, in order to reflect the point of view of a user who is browsing the documents during the retrieval process. We demonstrate the method with a simulation in a laboratory setting and present the "branching" curve sets characteristic of the proposed evaluation method. Both the average and the topic-by-topic results indicate that, when the freezing approach is adopted, giving feedback of mixed quality pays off under various usage scenarios, even though the modeled users prefer finding the most relevant documents in particular.
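The simulation described above can be sketched in a few lines. This is a minimal, illustrative reconstruction, not the paper's actual implementation: the function names, the toy 0–3 relevance grades, and the hand-picked reranking are all assumptions introduced here to show how a threshold/patience user model, freezing, and cumulated gain fit together.

```python
# Hedged sketch of the evaluation loop outlined in the abstract.
# All identifiers (patience, threshold, frozen_ranking, ...) are
# illustrative; they do not come from the paper itself.

def simulate_feedback(ranked, grades, threshold, patience):
    """Browse the top `patience` documents of the initial ranking and
    return (documents seen, documents accepted as feedback), where a
    document is accepted if its graded relevance meets the threshold."""
    seen = ranked[:patience]
    feedback = [d for d in seen if grades.get(d, 0) >= threshold]
    return seen, feedback

def frozen_ranking(seen, reranked_rest):
    """Freezing: documents already browsed keep their positions; only
    the unseen tail is replaced by the feedback-based reranking."""
    return seen + [d for d in reranked_rest if d not in seen]

def cumulated_gain(ranking, grades, depth):
    """Non-discounted cumulated gain at each rank down to `depth`."""
    cg, total = [], 0
    for d in ranking[:depth]:
        total += grades.get(d, 0)
        cg.append(total)
    return cg

# Toy example: graded relevance on a 0-3 scale, patience of 3 documents,
# and a user who only gives highly relevant documents (grade >= 2) as feedback.
grades = {"d1": 3, "d2": 0, "d3": 2, "d4": 1, "d5": 3}
initial = ["d2", "d1", "d4", "d3", "d5"]
seen, fb = simulate_feedback(initial, grades, threshold=2, patience=3)
# Pretend the feedback run promoted d5 and d3 in the unseen tail.
final = frozen_ranking(seen, ["d5", "d3"])
print(cumulated_gain(final, grades, depth=5))  # → [0, 3, 4, 7, 9]
```

Because the browsed prefix is frozen, any gain from feedback shows up only past the patience point, which is what produces the "branching" curve sets: all scenario curves share the frozen prefix and diverge after it.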
Keywords: Evaluation · Relevance feedback · Simulation · User modeling