Abstract
This paper presents an approach to the ‘sheer curation’ of the experimental data and processes of a group of researchers in the life sciences. Sheer curation embeds data capture and interpretation within researchers’ working practices, so that curation is automatic and invisible to the researcher. The environment described captures not just individual datasets but the entire workflow that represents the ‘story’ of the experiment, including intermediate files and provenance metadata, so as to support the verification and reproduction of published results. Because the curation environment is decoupled from the researchers’ processing environment, the provenance graph is inferred from a variety of domain-specific contextual information as the data is generated, using software that implements the knowledge and expertise of the researchers.
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hedges, M., Blanke, T. (2013). Digital Libraries for Experimental Data: Capturing Process through Sheer Curation. In: Aalberg, T., Papatheodorou, C., Dobreva, M., Tsakonas, G., Farrugia, C.J. (eds) Research and Advanced Technology for Digital Libraries. TPDL 2013. Lecture Notes in Computer Science, vol 8092. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40501-3_11
DOI: https://doi.org/10.1007/978-3-642-40501-3_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40500-6
Online ISBN: 978-3-642-40501-3
eBook Packages: Computer Science, Computer Science (R0)