“THAT’s What I Was Looking For”: Comparing User-Rated Relevance with Search Engine Rankings

  • Sameer Patil
  • Sherman R. Alpert
  • John Karat
  • Catherine Wolf
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3585)


We present a lightweight tool to compare the relevance ranking provided by a search engine to the relevance as actually judged by the user performing the query. Using the tool, we conducted a user study with two different versions of the search engine for a large corporate web site with more than 1.8 million pages, and with the popular search engine Google™. Our tool provides an inexpensive and efficient way to do this comparison, and can be easily extended to any search engine that provides an API. Relevance feedback from actual users can be used to assess precision and recall of a search engine’s retrieval algorithms and, perhaps more importantly, to tune its relevance ranking algorithms to better match user needs. We found the tool to be quite effective at comparing different versions of the same search engine, and for benchmarking by comparing against a standard.
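As a rough illustration of the kind of comparison described above, the sketch below scores one result list against one user's ratings. It is not the authors' tool: the 0 to 3 rating scale, the function names, and the displacement measure are all assumptions made for this example. Recall is left out because it requires knowing every relevant document in the collection, which a single result list does not provide.

```python
from typing import List, Sequence


def precision_at_k(ratings: Sequence[int], k: int, threshold: int = 1) -> float:
    """Fraction of the top-k results that the user rated at or above `threshold`.

    `ratings[i]` is the user's rating of the result the engine ranked at
    position i (0-based). The rating scale is a hypothetical one.
    """
    top = ratings[:k]
    if not top:
        return 0.0
    return sum(1 for r in top if r >= threshold) / len(top)


def user_order(ratings: Sequence[int]) -> List[int]:
    """Engine positions re-sorted by user rating, highest first.

    Python's sort is stable, so results with equal ratings keep
    the order the engine gave them.
    """
    return sorted(range(len(ratings)), key=lambda i: -ratings[i])


def mean_rank_displacement(ratings: Sequence[int]) -> float:
    """Average distance between a result's engine rank and its user-preferred rank."""
    order = user_order(ratings)
    if not order:
        return 0.0
    total = sum(abs(user_pos - engine_pos) for user_pos, engine_pos in enumerate(order))
    return total / len(order)


# Hypothetical ratings for the engine's top five results, in engine order.
ratings = [2, 0, 3, 1, 0]
print(precision_at_k(ratings, 3))       # share of the top 3 rated relevant
print(user_order(ratings))              # the order the user would have preferred
print(mean_rank_displacement(ratings))  # 0.0 would mean the two orders agree
```

A displacement of zero means the engine's order already matches the user's; larger values suggest the ranking algorithm is placing the results the user valued further from where the user would have put them.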


Keywords: Search Engine, Relevance Feedback, Relevance Judgment, Result List, Target Document



Copyright information

© IFIP International Federation for Information Processing 2005

Authors and Affiliations

  • Sameer Patil (1)
  • Sherman R. Alpert (2)
  • John Karat (2)
  • Catherine Wolf (2)
  1. Department of Informatics, Donald Bren School of Information and Computer Sciences, University of California, Irvine, USA
  2. IBM T. J. Watson Research Center, Hawthorne, USA
