A Methodology for Evaluating Aggregated Search Results

  • Jaime Arguello
  • Fernando Diaz
  • Jamie Callan
  • Ben Carterette
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6611)


Aggregated search is the task of incorporating results from different specialized search services, or verticals, into Web search results. While most prior work focuses on deciding which verticals to present, the task of deciding where in the Web results to embed the vertical results has received less attention. We propose a methodology for evaluating an aggregated set of results. Our method elicits a relatively small number of human judgements for a given query and then uses these to facilitate a metric-based evaluation of any possible presentation for the query. An extensive user study with 13 verticals confirms that, when users prefer one presentation of results over another, our metric agrees with the stated preference. By using Amazon’s Mechanical Turk, we show that reliable assessments can be obtained quickly and inexpensively.
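The core of the evaluation is checking whether the proposed metric agrees with users' stated preferences between pairs of presentations. The paper's actual metric is derived from block-level judgements; the snippet below is only a minimal illustrative sketch of the agreement check itself, with made-up scores and preference pairs.

```python
# Hypothetical sketch: measuring how often a metric agrees with stated
# user preferences over pairs of result presentations. The metric values
# and preference data are illustrative, not the authors' actual method.

def agreement_rate(metric_scores, user_preferences):
    """Fraction of presentation pairs where the higher-scoring
    presentation is also the one users preferred.

    metric_scores: dict mapping presentation id -> metric value
    user_preferences: list of (preferred_id, other_id) pairs
    """
    agree = sum(
        1 for preferred, other in user_preferences
        if metric_scores[preferred] > metric_scores[other]
    )
    return agree / len(user_preferences)

# Toy example: the metric agrees with 3 of 4 stated preferences.
scores = {"A": 0.8, "B": 0.5, "C": 0.3}
prefs = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "B")]
print(agreement_rate(scores, prefs))  # 0.75
```

In the paper's setting, disagreements on such pairs (discordant pairs) are the cases of interest when validating the metric against the user study.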


Keywords: Discordant Pair · Preference Judgement · Majority Preference · Vertical Block · Commercial Search Engine





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Jaime Arguello (1)
  • Fernando Diaz (2)
  • Jamie Callan (1)
  • Ben Carterette (3)
  1. Carnegie Mellon University, USA
  2. Yahoo! Research, USA
  3. University of Delaware, USA
