repro_eval: A Python Interface to Reproducibility Measures of System-Oriented IR Experiments

  • Conference paper
Advances in Information Retrieval (ECIR 2021)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12657)

In this work, we introduce repro_eval, a tool for reactive reproducibility studies of system-oriented Information Retrieval (IR) experiments. The corresponding Python package provides IR researchers with measures for different levels of reproduction when evaluating their systems’ outputs. By offering an easily extensible interface, we hope to encourage common practices for conducting reproducibility studies of system-oriented IR experiments.

  3. Previous versions of the policy basically swapped the meaning of the two terms reproducibility and replicability, which is why we used the terms vice versa in earlier studies.



This paper was partially supported by the EU Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 893667, and by the German Research Foundation (No. 407518790).

Corresponding author

Correspondence to Timo Breuer.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Breuer, T., Ferro, N., Maistro, M., Schaer, P. (2021). repro_eval: A Python Interface to Reproducibility Measures of System-Oriented IR Experiments. In: Hiemstra, D., Moens, MF., Mothe, J., Perego, R., Potthast, M., Sebastiani, F. (eds) Advances in Information Retrieval. ECIR 2021. Lecture Notes in Computer Science, vol 12657. Springer, Cham.

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-72239-5

  • Online ISBN: 978-3-030-72240-1

  • eBook Packages: Computer Science, Computer Science (R0)
