Digital Libraries for Experimental Data: Capturing Process through Sheer Curation

  • Mark Hedges
  • Tobias Blanke
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8092)


This paper presents an approach to the ‘sheer curation’ of the experimental data and processes of a group of researchers in the life sciences. Sheer curation embeds data capture and interpretation within researchers’ working practices, so that curation is automatic and invisible to the researcher. The environment described captures not just individual datasets but the entire workflow that represents the ‘story’ of the experiment, including intermediate files and provenance metadata, so as to support the verification and reproduction of published results. As the curation environment is decoupled from the researchers’ processing environment, a provenance graph is inferred from a variety of domain-specific contextual information as the data is generated, using software that implements the knowledge and expertise of the researchers.


Digital Library, Digital Preservation, Data Provenance, Sheer Curation, Digital Curation




Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Mark Hedges (1)
  • Tobias Blanke (1)
  1. Centre for e-Research, Department of Digital Humanities, King’s College London, UK