Encyclopedia of Database Systems

Living Edition
| Editors: Ling Liu, M. Tamer Özsu

Provenance and Reproducibility

  • Fernando Chirigati
  • Juliana Freire
Living reference work entry
DOI: https://doi.org/10.1007/978-1-4899-7993-3_80747-1

Definition

A computational experiment composed of a sequence of steps S created at time T, on environment (hardware and operating system) E, using data D is reproducible if it can be executed with a sequence of steps S′ (modified from or equal to S) at time T′ > T, on environment E′ (potentially different from E), using data D′ that is similar to (or the same as) D, with consistent results [5]. Replication is a special case of reproducibility where S′ = S and D′ = D. While there is substantial disagreement on how to define reproducibility [1], in particular across different domains, in this entry, we focus on computational reproducibility, i.e., reproducibility for computational experiments or processes.
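The definition above can be sketched in code. The following minimal illustration is not from the entry itself: the names `Experiment`, `is_replication`, and `is_reproduction` are hypothetical, and the check for reproducibility is deliberately simplified, since "consistent results" is domain-dependent and supplied here as a flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Experiment:
    steps: tuple      # S: the sequence of computational steps
    environment: str  # E: hardware and operating system description
    data: bytes       # D: the input data
    time: float       # T: when the experiment was carried out


def is_replication(original: Experiment, attempt: Experiment) -> bool:
    """Replication: the special case S' = S and D' = D, run at a later
    time T' > T (the environment E' may still differ)."""
    return (attempt.steps == original.steps
            and attempt.data == original.data
            and attempt.time > original.time)


def is_reproduction(original: Experiment, attempt: Experiment,
                    results_consistent: bool) -> bool:
    """Reproducibility: S', E', and D' may all differ from S, E, D;
    what matters is a later run (T' > T) with consistent results."""
    return attempt.time > original.time and results_consistent
```

Under this sketch, rerunning the same steps on the same data on a different machine still counts as replication, matching the definition's S′ = S, D′ = D condition.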

The information needed to reproduce an experiment can be obtained from its provenance: the details of how the experiment was carried out and the results it derived. For computational experiments, provenance can be systematically and transparently...
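As a concrete illustration of recording such provenance, the sketch below captures a minimal trace for one computational step using only the Python standard library: the command that was run, a fingerprint of the input data, and a description of the environment. The function name `capture_provenance` and the exact record layout are illustrative assumptions, not a tool described in this entry.

```python
import hashlib
import platform
import sys
import time


def capture_provenance(command: list, input_path: str) -> dict:
    """Record a minimal provenance trace for one computational step:
    the command executed, a hash of the input data, and the environment.

    This is a simplified sketch; real provenance systems also capture
    library versions, file dependencies, and intermediate results.
    """
    with open(input_path, "rb") as f:
        data_hash = hashlib.sha256(f.read()).hexdigest()
    return {
        "command": command,            # the step S that was executed
        "input_sha256": data_hash,     # fingerprint of the input data D
        "environment": {               # the environment E
            "os": platform.system(),
            "machine": platform.machine(),
            "python": sys.version.split()[0],
        },
        "timestamp": time.time(),      # the time T of the run
    }
```

Persisting such records alongside the results lets a later attempt verify that it used the same steps and data, and report how its environment differed.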

Keywords

Source Code; Computational Experiment; Computational Step; Operating System Level; Reproducible Research

Recommended Reading

  1. Baker M. Muddled meanings hamper efforts to fix reproducibility crisis. Nature News & Comment. 14 Jun 2016.
  2. Bonnet P, Manegold S, Bjørling M, Cao W, Gonzalez J, Granados J, Hall N, Idreos S, Ivanova M, Johnson R, Koop D, Kraska T, Müller R, Olteanu D, Papotti P, Reilly C, Tsirogiannis D, Yu C, Freire J, Shasha D. Repeatability and workability evaluation of SIGMOD'2011. SIGMOD Rec. 2011;40(2):45–8.
  3. Claerbout J, Karrenbach M. Electronic documents give reproducible research a new meaning. In: Proceedings of the 62nd Annual International Meeting of the Society of Exploration Geophysics; 1992. p. 601–4.
  4. Collberg C, Proebsting T, Warren AM. Repeatability and benefaction in computer systems research. Technical report TR 14-04, University of Arizona; 2015.
  5. Freire J, Bonnet P, Shasha D. Computational reproducibility: state-of-the-art, challenges, and database research opportunities. In: Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, SIGMOD'12. ACM, New York; 2012. p. 593–6.
  6. Knuth DE. Literate programming. Comput J. 1984;27(2):97–111.
  7. Kovacevic J. How to encourage and publish reproducible research. In: IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP; 2007, vol. 4, p. IV-1273–6.
  8. LeVeque R. Python tools for reproducible research on hyperbolic problems. Comput Sci Eng. 2009;11(1):19–27.
  9. Nuzzo R. How scientists fool themselves, and how they can stop. Nature. 2015;526(7572):182–5.
  10. Piwowar HA, Day RS, Fridsma DB. Sharing detailed research data is associated with increased citation rate. PLoS One. 2007;2(3):e308.
  11. Vandewalle P, Kovacevic J, Vetterli M. Reproducible research in signal processing – what, why, and how. IEEE Signal Process Mag. 2009;26(3):37–47.

Copyright information

© Springer Science+Business Media LLC 2017

Authors and Affiliations

  1. NYU Tandon School of Engineering, Brooklyn, USA
  2. NYU Center for Data Science, New York, USA