Abstract
Aggregated search is the task of incorporating results from different specialized search services, or verticals, into Web search results. While most prior work focuses on deciding which verticals to present, the task of deciding where in the Web results to embed the vertical results has received less attention. We propose a methodology for evaluating an aggregated set of results. Our method elicits a relatively small number of human judgements for a given query and then uses these to facilitate a metric-based evaluation of any possible presentation for the query. An extensive user study with 13 verticals confirms that, when users prefer one presentation of results over another, our metric agrees with the stated preference. By using Amazon’s Mechanical Turk, we show that reliable assessments can be obtained quickly and inexpensively.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
Cite this paper
Arguello, J., Diaz, F., Callan, J., Carterette, B. (2011). A Methodology for Evaluating Aggregated Search Results. In: Clough, P., et al. Advances in Information Retrieval. ECIR 2011. Lecture Notes in Computer Science, vol 6611. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20161-5_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20160-8
Online ISBN: 978-3-642-20161-5