Abstract
It is generally acknowledged that when John Chambers and Richard Becker incorporated the S AUDIT facility into their S statistical programming language and environment in 1988, they created one of the first provenance-aware applications. The open-source R project has since become the spiritual successor to S; however, R has no such facility for tracking provenance. This paper looks at how provenance-awareness is being introduced to CXXR (http://www.cs.kent.ac.uk/projects/cxxr), a variant of the R interpreter designed to allow the creation of experimental R versions. We explore the issues surrounding recording, representing, and interrogating provenance information in a command-line-driven interactive environment that uses a lazy functional programming language. We also characterise provenance information in this domain and evaluate the impact of adding facilities for provenance tracking.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Silles, C.A., Runnalls, A.R. (2010). Provenance-Awareness in R. In: McGuinness, D.L., Michaelis, J.R., Moreau, L. (eds) Provenance and Annotation of Data and Processes. IPAW 2010. Lecture Notes in Computer Science, vol 6378. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17819-1_8
DOI: https://doi.org/10.1007/978-3-642-17819-1_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17818-4
Online ISBN: 978-3-642-17819-1
eBook Packages: Computer Science, Computer Science (R0)