Using Provenance to Support Good Laboratory Practice in Grid Environments

Part of the Studies in Computational Intelligence book series (SCI, volume 426)

Abstract

Conducting experiments and documenting results is the daily business of scientists. Good, traceable documentation enables other scientists to confirm procedures and results, which increases credibility. The rules governing documentation and scientific conduct are known as “good laboratory practice.” Laboratory notebooks are used to record each step in conducting an experiment and processing data. Originally, these notebooks were paper based. With computerised research systems, acquired data became more elaborate, increasing the need for electronic notebooks that offer data storage, computational features and reliable electronic documentation. As a new approach, a scientific data management system (DataFinder) is enhanced with features for traceable documentation. Provenance recording is used to meet the traceability requirements, and the recorded information can later be queried for further analysis. DataFinder has further features that are important for scientific documentation: it employs a heterogeneous and distributed data storage concept, which enables access to different types of data storage systems (e.g. Grid data infrastructure, file servers). In this chapter we describe a number of building blocks that are available or whose development is close to completion. These components are intended for assembling an electronic laboratory notebook for use in Grid environments, while retaining maximal flexibility in usage scenarios and maximal compatibility with each other. With such a system, provenance can be used to trace the scientific workflow of preparation, execution, evaluation, interpretation and archiving of research data. The reliability of research results increases and the research process remains transparent to remote research partners.
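
As an illustration of recording provenance and querying it later, the following is a minimal sketch in Python. It is not DataFinder's interface: the class `ProvenanceLog`, its methods `record` and `trace`, and all file names are hypothetical and chosen only for this example. The step names follow the workflow stages listed in the abstract.

```python
class ProvenanceLog:
    """Toy in-memory provenance record (hypothetical, not DataFinder's API)."""

    def __init__(self):
        # Maps each data item to (step that produced it, inputs that step used).
        self._origin = {}

    def record(self, step, inputs, outputs):
        """Record that `step` read `inputs` and produced `outputs`."""
        for item in outputs:
            self._origin[item] = (step, tuple(inputs))

    def trace(self, item):
        """Return the steps that led to `item`, earliest first."""
        steps = []
        seen = set()

        def visit(current):
            # Skip items already visited and items with no recorded origin.
            if current in seen or current not in self._origin:
                return
            seen.add(current)
            step, inputs = self._origin[current]
            for source in inputs:
                visit(source)
            if step not in steps:
                steps.append(step)

        visit(item)
        return steps


# Example workflow with made-up data items.
log = ProvenanceLog()
log.record("preparation", [], ["protocol.txt"])
log.record("execution", ["protocol.txt"], ["raw.dat"])
log.record("evaluation", ["raw.dat"], ["stats.csv"])
log.record("interpretation", ["stats.csv"], ["report.pdf"])
log.record("archiving", ["raw.dat", "report.pdf"], ["archive.zip"])

lineage = log.trace("archive.zip")
print(lineage)
```

Querying the archived item returns every step it depends on, in order, which is the kind of traceability question a laboratory notebook has to answer. A real system would persist these records and attach metadata such as the responsible person and a timestamp.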

Keywords

Data Item; Data Management System; Grid Environment; Storage Server; Good Laboratory Practice
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. Simulation and Software Technology, German Aerospace Centre, Berlin, Germany
  2. School of Computing + Mathematical Sciences, Auckland University of Technology, Auckland, New Zealand
