Abstract
We design and validate simulators for generating queries and relevance judgments for retrieval system evaluation. We develop a simulation framework that incorporates existing and new simulation strategies. To validate a simulator, we assess whether evaluation using its output data ranks retrieval systems in the same way as evaluation using real-world data. The real-world data is obtained from logged commercial searches and the associated purchase decisions. While no simulator reproduces an ideal ranking, simulator performance varies widely, which allows us to distinguish the simulators that are better suited to creating artificial testbeds for retrieval experiments. Incorporating knowledge about document structure in the query generation process helps create more realistic simulators.
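The validation idea in the abstract, checking whether simulated data ranks retrieval systems the same way real data does, can be illustrated with a rank correlation between two sets of per-system scores. The sketch below is only an illustration under stated assumptions: the abstract does not name the agreement measure used, so Kendall's tau is assumed here, and the system names and scores are hypothetical, not results from the paper.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two {system: score} mappings.

    Only systems present in both mappings are compared. Tied pairs count
    as neither concordant nor discordant.
    """
    systems = sorted(set(scores_a) & set(scores_b))
    if len(systems) < 2:
        raise ValueError("need at least two shared systems")

    concordant = 0
    discordant = 0
    for s, t in combinations(systems, 2):
        # The sign of each difference says which system is ranked higher.
        diff_a = scores_a[s] - scores_a[t]
        diff_b = scores_b[s] - scores_b[t]
        product = diff_a * diff_b
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1

    n_pairs = len(systems) * (len(systems) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical effectiveness scores (assumed values, for illustration only).
real = {"sysA": 0.42, "sysB": 0.35, "sysC": 0.28, "sysD": 0.10}
simulated = {"sysA": 0.61, "sysB": 0.66, "sysC": 0.40, "sysD": 0.22}

tau = kendall_tau(real, simulated)
print(f"Kendall's tau between real and simulated rankings: {tau:.3f}")
```

A value of 1 would mean the simulator orders every pair of systems exactly as the real data does; in this made-up example only the sysA/sysB pair is swapped, so the agreement is high but not perfect.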
© 2010 Springer-Verlag Berlin Heidelberg
Cite this paper
Huurnink, B., Hofmann, K., de Rijke, M., Bron, M. (2010). Validating Query Simulators: An Experiment Using Commercial Searches and Purchases. In: Agosti, M., Ferro, N., Peters, C., de Rijke, M., Smeaton, A. (eds) Multilingual and Multimodal Information Access Evaluation. CLEF 2010. Lecture Notes in Computer Science, vol 6360. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15998-5_6
DOI: https://doi.org/10.1007/978-3-642-15998-5_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15997-8
Online ISBN: 978-3-642-15998-5
eBook Packages: Computer Science, Computer Science (R0)