Black Box Evaluation for Operational Information Retrieval Applications

  • Martin Braschler
  • Melanie Imhof
  • Stefan Rietberger
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8173)


The black box application evaluation methodology described in this tutorial is applicable to a broad range of operational information retrieval (IR) applications. In contrast to popular, traditional IR evaluation approaches, which are limited to measuring IR system performance on a test collection, the black box evaluation methodology considers an IR application in its entirety: the underlying system, the corresponding document collection, and its configuration/application layer. A comprehensive set of quality criteria is used to estimate the user's perception of the application. Scores are assigned as a weighted average of results from tests that evaluate individual aspects. The methodology was validated in a small evaluation campaign. An analysis of this campaign shows a correlation between the testers' perception of the applications and the evaluation scores. Moreover, functional weaknesses of the tested IR applications can be identified and then systematically targeted.
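The weighted-average scoring mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not list the individual tests or their weights, so the test names, results, and weights below are purely hypothetical, as is the assumption that each test result is normalized to the range 0 to 1.

```python
def weighted_score(results, weights):
    """Return the weighted average of per-test results.

    results: test name -> result, assumed normalized to [0, 1]
    weights: test name -> non-negative weight (hypothetical values)
    """
    total_weight = sum(weights[name] for name in results)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(results[name] * weights[name] for name in results)
    return weighted_sum / total_weight


# Hypothetical tests; the names are illustrative, not taken from the tutorial.
results = {
    "index_completeness": 0.8,
    "query_matching": 0.6,
    "result_presentation": 1.0,
}
weights = {
    "index_completeness": 2.0,
    "query_matching": 3.0,
    "result_presentation": 1.0,
}

overall = weighted_score(results, weights)
print(f"overall score: {overall:.3f}")
```

Dividing by the sum of the weights keeps the overall score on the same scale as the individual results, so a low-scoring test with a high weight stands out as a weakness to target.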


Keywords: information retrieval, application evaluation, black box, user perception





Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Martin Braschler
    • 1
  • Melanie Imhof
    • 1
  • Stefan Rietberger
    • 1
  1. Zurich University of Applied Sciences, Winterthur, Switzerland
